AITopics | consistency measure

consistency measure

df7e148cabfd9b608090fa5ee3348bfe-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 13:06:34 GMT

ablsim, knowledge base, reasoning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Fast Abductive Learning by Similarity-based Consistency Optimization

Yu-Xuan Huang

Neural Information Processing Systems | Aug-18-2025, 01:05:19 GMT

However, to enable effective abduction, previous approaches need an initialized perception model that discriminates the input raw instances.

artificial intelligence, knowledge base, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Quantifying Prediction Consistency Under Model Multiplicity in Tabular LLMs

Hamman, Faisal, Dissanayake, Pasan, Mishra, Saumitra, Lecue, Freddy, Dutta, Sanghamitra

arXiv.org Machine Learning | Jul-4-2024

Fine-tuning large language models (LLMs) on limited tabular data for classification tasks can lead to fine-tuning multiplicity, where equally well-performing models make conflicting predictions on the same inputs due to variations in the training process (e.g., seed, random weight initialization, retraining on additional or deleted samples). This raises critical concerns about the robustness and reliability of Tabular LLMs, particularly when deployed for high-stakes decision-making in domains such as finance, hiring, education, and healthcare. This work formalizes the challenge of fine-tuning multiplicity in Tabular LLMs and proposes a novel metric to quantify the robustness of individual predictions without expensive model retraining. Our metric quantifies a prediction's stability by sampling the model's local behavior around the input in the embedding space. Interestingly, we show that sampling in the local neighborhood can be leveraged to provide probabilistic robustness guarantees against a broad class of fine-tuned models. By leveraging Bernstein's Inequality, we show that predictions with sufficiently high robustness (as defined by our measure) will remain consistent with high probability. We also provide empirical evaluation on real-world datasets to support our theoretical results. Our work highlights the importance of addressing fine-tuning instabilities to enable trustworthy deployment of LLMs in high-stakes and safety-critical applications.

fine-tuned model, multiplicity, prediction, (14 more...)

arXiv.org Machine Learning

2407.04173

Country: North America > United States > Maryland > Prince George's County > College Park (0.04)

Genre: Research Report (0.82)

Industry:

Banking & Finance (1.00)
Information Technology > Security & Privacy (0.93)
Health & Medicine (0.89)
Law (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
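
The local-sampling idea in the abstract above can be illustrated with a small sketch. This is not the authors' metric: the function name `local_consistency`, the uniform box perturbation, the `radius` and `n_samples` parameters, and the toy classifier are all assumptions made for illustration. It only shows the general pattern of scoring one prediction by how often nearby inputs receive the same label.

```python
import random


def local_consistency(predict, x, radius=0.1, n_samples=200, seed=0):
    """Fraction of perturbed copies of x that get the same label as x.

    Hypothetical helper: perturbs each coordinate uniformly within
    [-radius, radius], which is an assumption, not the paper's sampler.
    """
    rng = random.Random(seed)
    base_label = predict(x)
    agree = 0
    for _ in range(n_samples):
        neighbor = [xi + rng.uniform(-radius, radius) for xi in x]
        agree += predict(neighbor) == base_label
    return agree / n_samples


def toy_classifier(v):
    # Stand-in for a fine-tuned model: a linear decision boundary.
    return int(v[0] + v[1] > 1.0)


far = local_consistency(toy_classifier, [2.0, 2.0])   # far from the boundary
near = local_consistency(toy_classifier, [0.5, 0.5])  # on the boundary
print(far, near)
```

A point far from the decision boundary scores 1.0, while a point on the boundary scores noticeably lower, which is the kind of signal such a measure is meant to surface.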

Measuring Reliability of Large Language Models through Semantic Consistency

Raj, Harsh, Rosati, Domenic, Majumdar, Subhabrata

arXiv.org Artificial Intelligence | Apr-11-2023

While large pretrained language models (PLMs) demonstrate incredible fluency and performance on many natural language tasks, recent work has shown that well-performing PLMs are very sensitive to what prompts are fed into them. Even when prompts are semantically identical, language models may give very different answers. When considering safe and trustworthy deployments of PLMs, we would like their outputs to be consistent under prompts that mean the same thing or convey the same intent. While some work has looked into how state-of-the-art PLMs address this need, it has been limited to evaluating lexical equality of single- or multi-word answers and does not address consistency of generative text sequences. In order to understand consistency of PLMs under text generation settings, we develop a measure of semantic consistency that allows the comparison of open-ended text outputs. We implement several versions of this consistency metric to evaluate the performance of a number of PLMs on paraphrased versions of questions in the TruthfulQA dataset. We find that our proposed metrics are considerably more consistent than traditional metrics embodying lexical consistency, and also correlate with human evaluation of output consistency to a higher degree.

consistency, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2211.05853

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(4 more...)

Genre: Research Report (0.43)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
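
One common way to turn "consistent under paraphrased prompts" into a number is to average a pairwise agreement judgment over all outputs generated for paraphrases of the same question. The sketch below assumes that pattern. The names `pairwise_consistency` and `jaccard_agreement` are hypothetical, and the token-overlap judge is only a lexical placeholder. A semantic judge (for example an entailment or paraphrase model) would be passed in through the `agree` parameter instead.

```python
from itertools import combinations


def jaccard_agreement(a, b, threshold=0.5):
    """Placeholder judge: token-set overlap at or above a threshold."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return True
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b) >= threshold


def pairwise_consistency(outputs, agree=jaccard_agreement):
    """Share of output pairs that the judge `agree` considers equivalent."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0  # zero or one output is trivially consistent
    return sum(agree(a, b) for a, b in pairs) / len(pairs)


answers = ["Paris is the capital", "paris is the capital", "Lyon"]
score = pairwise_consistency(answers)
print(score)
```

Keeping the judge as a parameter makes the lexical-versus-semantic contrast from the abstract explicit: the aggregation stays the same and only the notion of agreement changes.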

ST-CoNAL: Consistency-Based Acquisition Criterion Using Temporal Self-Ensemble for Active Learning

Baik, Jae Soon, Yoon, In Young, Choi, Jun Won

arXiv.org Artificial Intelligence | Oct-16-2022

Modern deep learning has achieved great success in various fields. However, it requires the labeling of huge amounts of data, which is expensive and labor-intensive. Active learning (AL), which identifies the most informative samples to be labeled, is becoming increasingly important to maximize the efficiency of the training process. The existing AL methods mostly use only a single final fixed model for acquiring the samples to be labeled. This strategy may not be good enough in that the structural uncertainty of a model for given training data is not considered to acquire the samples. In this study, we propose a novel acquisition criterion based on temporal self-ensemble generated by conventional stochastic gradient descent (SGD) optimization. These self-ensemble models are obtained by capturing the intermediate network weights obtained through SGD iterations. Our acquisition function relies on a consistency measure between the student and teacher models. The student models are given a fixed number of temporal self-ensemble models, and the teacher model is constructed by averaging the weights of the student models. Using the proposed acquisition criterion, we present an AL algorithm, namely student-teacher consistency-based AL (ST-CoNAL). Experiments conducted for image classification tasks on CIFAR-10, CIFAR-100, Caltech-256, and Tiny ImageNet datasets demonstrate that the proposed ST-CoNAL achieves significantly better performance than the existing acquisition methods. Furthermore, extensive experiments show the robustness and effectiveness of our methods.

artificial intelligence, machine learning, st-conal, (18 more...)

arXiv.org Artificial Intelligence

2207.02182

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.91)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
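
The student-teacher construction described in the abstract above can be sketched on a toy model. This is a sketch under stated assumptions, not the ST-CoNAL implementation: the logistic `predict` function, the use of KL divergence as the disagreement score, and the rule that larger disagreement means a more informative sample are all assumptions, and every function name is hypothetical. What it does take from the abstract is that the teacher is the average of the student weights.

```python
import math


def average_weights(snapshots):
    """Teacher weights: element-wise mean of the student snapshots."""
    n = len(snapshots)
    return [sum(ws) / n for ws in zip(*snapshots)]


def predict(weights, x):
    """Toy binary model: class probabilities from a logistic unit."""
    logit = sum(w * xi for w, xi in zip(weights, x))
    p = 1.0 / (1.0 + math.exp(-logit))
    return [p, 1.0 - p]


def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def inconsistency(snapshots, x):
    """Summed teacher-to-student divergence for one unlabeled sample."""
    teacher = predict(average_weights(snapshots), x)
    return sum(kl(teacher, predict(w, x)) for w in snapshots)


def select_for_labeling(snapshots, pool, budget):
    """Pick the `budget` pool samples with the largest inconsistency."""
    ranked = sorted(pool, key=lambda x: inconsistency(snapshots, x), reverse=True)
    return ranked[:budget]


snapshots = [[1.0, 0.0], [0.0, 1.0]]   # two hypothetical SGD snapshots
pool = [[1.0, 1.0], [3.0, -3.0]]       # unlabeled candidates
chosen = select_for_labeling(snapshots, pool, budget=1)
print(chosen)
```

Here the two snapshots agree on the first candidate and disagree strongly on the second, so the second is the one selected for labeling.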

Consistency Measures for Feature Selection: A Formal Definition, Relative Sensitivity Comparison and a Fast Algorithm

Shin, Kilho (University of Hyogo) | Fernandes, Danny (University of Hyogo) | Miyazaki, Seiya (Panasonic Corporation)

AAAI Conferences | Jul-19-2011

Consistency-based feature selection is an important category of feature selection research yet is defined only intuitively in the literature. First, we formally define a consistency measure, and then using this definition, evaluate 19 feature selection measures from the literature. While only 5 of these were labeled as consistency measures by their original authors, by our definition, an additional 9 measures should be classified as consistency measures. To compare these 14 consistency measures in terms of sensitivity, we introduce the concept of quasilinear compatibility order, and partially determine the order among the measures. Next, we propose a new fast algorithm for consistency-based feature selection. We ran experiments using eleven large datasets to compare the performance of our algorithm against INTERACT and LCC, the only two instances of consistency-based algorithms with potential real world application. Our algorithm shows vast improvement in time efficiency, while its performance in accuracy is comparable with that of INTERACT and LCC.

algorithm, consistency measure, dataset, (12 more...)

AAAI Conferences

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
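
For readers unfamiliar with consistency measures in feature selection, a classic example is the inconsistency rate: group the instances that share identical values on the selected features, and count every instance that does not belong to its group's majority class. The sketch below assumes that definition. It is offered as a generic illustration and is not claimed to be the formal definition or the fast algorithm from the paper above. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, features):
    """Share of instances outside the majority class of their group.

    A group is the set of rows with identical values on `features`
    (a list of column indices). A rate of 0.0 means the selected
    features determine the label on this dataset.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        groups[key][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)


# XOR-style toy data: neither feature alone determines the label.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
both = inconsistency_rate(rows, labels, [0, 1])
first_only = inconsistency_rate(rows, labels, [0])
print(both, first_only)
```

The XOR data shows why such measures are useful for detecting interacting features: the pair is fully consistent, while either feature on its own is not.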

Softening Discrete Relaxation

Finch, Andrew M., Wilson, Richard C., Hancock, Edwin R.

Neural Information Processing Systems | Dec-31-1997

This paper describes a new framework for relational graph matching. The starting point is a recently reported Bayesian consistency measure which gauges structural differences using Hamming distance. The main contributions of the work are threefold. Firstly, we demonstrate how the discrete components of the cost function can be softened. The second contribution is to show how the softened cost function can be used to locate matches using continuous nonlinear optimisation. Finally, we show how the resulting graph matching algorithm relates to the standard quadratic assignment problem. 1 Introduction Graph matching [6, 5, 7, 2, 3, 12, 11] is a topic of central importance in pattern perception. The main computational issues are how to compare inexact relational descriptions [7] and how to search efficiently for the best match [8]. These two issues have recently stimulated interest in the connectionist literature [9, 6, 5, 10]. For instance, Simic [9], Suganathan et al. [10] and Gold et al.

equation, graph, hamming distance, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)
Information Technology > Artificial Intelligence > Machine Learning (0.70)
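
To make "structural differences using Hamming distance" concrete, the sketch below counts how many edges of a data graph fail to land on an edge of a model graph under a candidate node match. This is only an illustration of a Hamming-style structural count. It is not the Bayesian consistency measure or the softened cost function from the paper, and the function name, the edge-list representation, and the toy graphs are assumptions.

```python
def edge_hamming_distance(data_edges, model_edges, match):
    """Number of data-graph edges whose image is not a model-graph edge.

    `match` maps data-graph nodes to model-graph nodes. Edges are
    treated as undirected.
    """
    model = {frozenset(edge) for edge in model_edges}
    distance = 0
    for u, v in data_edges:
        image = frozenset((match[u], match[v]))
        distance += image not in model
    return distance


data_edges = [("a", "b"), ("b", "c")]
model_edges = [(1, 2), (2, 3)]
good = edge_hamming_distance(data_edges, model_edges, {"a": 1, "b": 2, "c": 3})
bad = edge_hamming_distance(data_edges, model_edges, {"a": 1, "b": 3, "c": 2})
print(good, bad)
```

A count like this is discrete in the match, which is the property the abstract's "softening" step is meant to relax so that continuous optimisation can be applied.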

Hyperparameters Evidence and Generalisation for an Unrealisable Rule

Marion, Glenn, Saad, David

Neural Information Processing Systems | Dec-31-1995

Using a statistical mechanical formalism, we calculate the evidence, generalisation error and consistency measure for a linear perceptron trained and tested on a set of examples generated by a nonlinear teacher. The teacher is said to be unrealisable because the student can never model it without error. Our model allows us to interpolate between the known case of a linear teacher, and an unrealisable, nonlinear teacher. A comparison of the hyperparameters which maximise the evidence with those that optimise the performance measures reveals that, in the nonlinear case, the evidence procedure is a misleading guide to optimising performance. Finally, we explore the extent to which the evidence procedure is unreliable and find that, despite being sub-optimal, in some circumstances it might be a useful method for fixing the hyperparameters. 1 INTRODUCTION The analysis of supervised learning or learning from examples is a major field of research within neural networks.

evidence procedure, generalisation error, performance measure, (13 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom (0.14)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
